90 research outputs found

    Kinship Verification from Videos using Spatio-Temporal Texture Features and Deep Learning

    Automatic kinship verification using facial images is a relatively new and challenging research problem in computer vision. It consists of automatically predicting whether two persons have a biological kin relation by examining their facial attributes. While most existing works extract shallow handcrafted features from still face images, we approach this problem from a spatio-temporal point of view and explore the use of both shallow texture features and deep features for characterizing faces. Promising results, especially those of deep features, are obtained on the benchmark UvA-NEMO Smile database. Our extensive experiments also show the superiority of using videos over still images, pointing to the important role of facial dynamics in kinship verification. Furthermore, the fusion of the two types of features (i.e. shallow spatio-temporal texture features and deep features) shows significant performance improvements compared to state-of-the-art methods. Comment: 7 pages

    A Cyberpunk 2077 perspective on the prediction and understanding of future technology

    Science fiction and video games have long served as valuable tools for envisioning and inspiring future technological advancements. This position paper investigates the potential of Cyberpunk 2077, a popular science fiction video game, to shed light on the future of technology, particularly in the areas of artificial intelligence, edge computing, augmented humans, and biotechnology. By analyzing the game's portrayal of these technologies and their implications, we aim to understand the possibilities and challenges that lie ahead. We discuss key themes such as neurolink and brain-computer interfaces, multimodal recording systems, virtual and simulated reality, digital representation of the physical world, augmented and AI-based home appliances, smart clothing, and autonomous vehicles. The paper highlights the importance of designing technologies that can coexist with existing preferences and systems, considering the uneven adoption of new technologies. Through this exploration, we emphasize the potential of science fiction and video games like Cyberpunk 2077 as tools for guiding future technological advancements and shaping public perception of emerging innovations. Comment: 12 pages, 7 figures

    Evaluation of real-time LBP computing in multiple architectures (Journal of Real-Time Image Processing manuscript)

    Local Binary Pattern (LBP) is a texture operator used in many computer vision applications that often require real-time operation on multiple computing platforms. The emergence of new video standards has increased typical resolutions and frame rates, which demand considerable computational performance. Since LBP is essentially a pixel operator that scales with image size, straightforward implementations are usually insufficient to meet these requirements. To identify the solutions that maximize the performance of real-time LBP extraction, we compare a series of different implementations in terms of computational performance and energy efficiency, while analyzing the optimizations that can be made to reach real-time performance on multiple platforms and their available computing resources. Our contribution is an extensive survey of the LBP implementations on different platforms that can be found in the literature. To provide a more complete evaluation, we have implemented the LBP algorithms on several platforms, such as Graphics Processing Units, mobile processors and a hybrid programming model image coprocessor. We have extended the evaluation of some of the solutions that can be found in previous work. In addition, we publish the source code of our implementations.
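To illustrate why LBP cost scales with image size, the basic 3x3 operator can be sketched as below. This is a minimal NumPy sketch, not the authors' published implementation: the function name `lbp_3x3`, the `>=` comparison and the clockwise bit ordering are assumptions made here for illustration.

```python
import numpy as np


def lbp_3x3(image):
    """Return the basic 8-neighbour LBP code for every interior pixel.

    `image` is a 2-D grayscale array; the result has shape (H - 2, W - 2).
    """
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left pixel.
    # This bit ordering is a convention assumed for this sketch.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.uint8) << bit
    return codes
```

Every interior pixel is compared against its eight neighbours, so the work grows linearly with the number of pixels, which is why higher resolutions and frame rates make naive implementations too slow.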

    Diseño y ensayo de férulas personalizadas mediante impresión 3D [Design and testing of customized splints using 3D printing]

    This project aims to take the splint model currently used for fractures a step further. We try to replace the material in use, plaster of Paris, which is chosen mainly because of its low price and its high adaptability to the shape of the fractured limb to which it is applied. We believe that 3D-printed materials can match these qualities well: parts manufactured by this process are very low in cost, and the main material used, PLA plastic, could achieve a stiffness similar to that of hardened plaster. On these premises, a series of tests and iterations will be carried out to find a 3D-printed PLA splint that fits the target limb as closely as possible and is strong enough to immobilize the bone. Using this material for orthoses also offers further advantages, such as lower weight and greater ease of movement for the user, since the plastic can get wet and allows greater mobility, along with other characteristics discussed throughout the project. Universidad de Sevilla. Máster Universitario en Ingeniería Industrial

    MAMAF-Net: Motion-Aware and Multi-Attention Fusion Network for Stroke Diagnosis

    Stroke is a major cause of mortality and disability worldwide, and one in four people are at risk of experiencing it in their lifetime. Pre-hospital stroke assessment plays a vital role in identifying stroke patients accurately to accelerate further examination and treatment in hospitals. Accordingly, the National Institutes of Health Stroke Scale (NIHSS), Cincinnati Pre-hospital Stroke Scale (CPSS) and Face Arm Speech Time (F.A.S.T.) are globally known tests for stroke assessment. However, the validity of these tests is questionable in the absence of neurologists. Therefore, in this study, we propose a motion-aware and multi-attention fusion network (MAMAF-Net) that can detect stroke from multimodal examination videos. Contrary to other studies on stroke detection from video analysis, our study for the first time proposes an end-to-end solution from multiple video recordings of each subject, with a dataset encapsulating stroke, transient ischemic attack (TIA), and healthy controls. The proposed MAMAF-Net consists of motion-aware modules to sense the mobility of patients, attention modules to fuse the multi-input video data, and 3D convolutional layers to perform diagnosis from the attention-based extracted features. Experimental results on the collected StrokeDATA dataset show that the proposed MAMAF-Net detects stroke with 93.62% sensitivity and a 95.33% AUC score.

    Improving Depression estimation from facial videos with face alignment, training optimization and scheduling

    Deep learning models have shown promising results in recognizing depressive states using video-based facial expressions. While successful models typically leverage 3D-CNNs or video distillation techniques, the differing use of pretraining, data augmentation, preprocessing, and optimization techniques across experiments makes it difficult to make fair architectural comparisons. We propose instead to enhance two simple models based on ResNet-50 that use only static spatial information by using two specific face alignment methods and improved data augmentation, optimization, and scheduling techniques. Our extensive experiments on benchmark datasets obtain results similar to those of sophisticated spatio-temporal models for single streams, while the score-level fusion of two different streams outperforms state-of-the-art methods. Our findings suggest that specific modifications in the preprocessing and training process result in noticeable differences in the performance of the models and could hide the actual differences originally attributed to the use of different neural network architectures. Comment: 5 pages

    Audio-Based Classification of Respiratory Diseases using Advanced Signal Processing and Machine Learning for Assistive Diagnosis Support

    In global healthcare, respiratory diseases are a leading cause of mortality, underscoring the need for rapid and accurate diagnostics. To advance rapid screening techniques via auscultation, our research focuses on employing one of the largest publicly available medical databases of respiratory sounds to train multiple machine learning models able to classify different health conditions. Our method combines Empirical Mode Decomposition (EMD) and spectral analysis to extract physiologically relevant biosignals from acoustic data, closely tied to cardiovascular and respiratory patterns, which sets our approach apart from conventional audio feature extraction practices. We use Power Spectral Density analysis and filtering techniques to select Intrinsic Mode Functions (IMFs) strongly correlated with underlying physiological phenomena. These biosignals undergo a comprehensive feature extraction process for predictive modeling. Initially, we deploy a binary classification model that demonstrates a balanced accuracy of 87% in distinguishing between healthy and diseased individuals. Subsequently, we employ a six-class classification model that achieves a balanced accuracy of 72% in diagnosing specific respiratory conditions like pneumonia and chronic obstructive pulmonary disease (COPD). For the first time, we also introduce regression models that estimate age and body mass index (BMI) based solely on acoustic data, as well as a model for gender classification. Our findings underscore the potential of this approach to significantly enhance assistive and remote diagnostic capabilities. Comment: 5 pages, 2 figures, 3 tables, Conference paper

    Natural course of septo-optic dysplasia: Retrospective analysis of 20 cases

    Introduction. Septo-optic dysplasia (SOD) is the variable combination of signs of dysgenesis of the midline of the brain, hypoplasia of the optic nerves and hypothalamus-pituitary dysfunction, which is sometimes associated with a varied spectrum of malformations of the cerebral cortex. Aims. To describe the natural history and neuroimaging findings in a series of 20 diagnosed patients. Patients and methods. We retrospectively review the epidemiological, clinical and neuroimaging characteristics of 20 consecutive patients diagnosed with SOD between January 1985 and January 2010. Data obtained from computerised tomography, magnetic resonance imaging of the head, electroencephalogram, visual evoked potentials, ophthalmological evaluation, karyotyping and endocrinological studies were analysed. In seven patients, a study of the homeobox gene HESX1 was conducted. Results. Pathological antecedents in the first three months of gestation were present in 60% of the cases, with normal results in the foetal ultrasound scans. Clinically, the most striking features were visual manifestations (85%), endocrine disorders (50%), mental retardation (60%) and epileptic seizures (55%). Fifty-five per cent were associated with abnormal neuronal migration. In 45%, SOD was the only finding in the neuroimaging scans. Karyotyping was performed in all cases, the results being normal. The HESX1 gene study was positive in two of the seven cases studied (both with isolated SOD). None of those with a mutation in the HESX1 gene presented familial consanguinity. No gene study was conducted with the parents. Conclusions. SOD must be classified as a heterogeneous malformation syndrome, which is associated with multiple brain, ocular, endocrine and systemic anomalies. The most severe forms are associated with abnormal neuronal migration and cortical organisation.

    Introducing VTT-ConIot: A Realistic Dataset for Activity Recognition of Construction Workers Using IMU Devices

    Sustainable work aims at improving working conditions to allow workers to effectively extend their working life. In this context, occupational safety and well-being are major concerns, especially in labor-intensive fields such as construction-related work. Internet of Things and wearable sensors provide unobtrusive technology that could enhance safety using human activity recognition (HAR) techniques, and have the potential to improve work conditions and health. However, the research community lacks commonly used standard datasets that provide realistic and varied activities from multiple users. In this article, our contributions are threefold. First, we present VTT-ConIoT, a new publicly available dataset for the evaluation of HAR from inertial sensors in professional construction settings. The dataset, which contains data from 13 users and 16 different activities, is collected from three different wearable sensor locations. Second, we provide a benchmark baseline for human activity recognition that shows a classification accuracy of up to 89% for a six-class setup and up to 78% for a more granular sixteen-class one. Finally, we analyze the representativity and usefulness of the dataset by comparing it with data collected in a pilot study made in a real construction environment with real workers.